Low-Cost Support for Fine-Grain Synchronization in Multiprocessors
Abstract
As multiprocessors scale beyond the limits of a few tens of processors, they must look beyond traditional methods of synchronization to minimize serialization and achieve the high degrees of parallelism required to utilize large machines. By allowing synchronization at the level of the smallest unit of memory, fine-grain synchronization achieves these goals. Unfortunately, supporting efficient fine-grain synchronization without inordinate amounts of hardware has remained a challenge. This paper describes the support for fine-grain synchronization provided by the Alewife system. The premise underlying Alewife's implementation is that successful synchronization attempts are the common case when serialization is minimized through word-level synchronization. For our applications, the failure rates were less than 7%. Efficiency at low hardware cost is achieved by providing hardware support to streamline successful synchronization attempts and relegating other non-critical operations to software. Alewife provides a large synchronization name space by associating full/empty bits with each memory word. Successful synchronization attempts execute at normal load-store speeds, while attempts that fail invoke appropriate software trap handlers through a fast trap mechanism. The software handlers deal with the issues of retrying versus blocking, queueing, and rescheduling. The efficiency of Alewife's mechanisms is analyzed by comparing the costs of various synchronization operations and parallel application execution time. In several applications we studied, our hardware support improved performance by 35%-50%.
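The division of labor described in the abstract (a fast path for successful attempts, a software handler for failures) can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions: the names `Word`, `SyncFault`, `read_and_empty`, `write_and_fill`, and `sync_read` are hypothetical and are not Alewife's instructions or API, the exception merely stands in for the fast trap, and the retry bound is an arbitrary illustrative policy.

```python
from dataclasses import dataclass
from typing import Any, Callable


class SyncFault(Exception):
    """Stands in for the trap taken when a synchronization attempt fails."""


@dataclass
class Word:
    """A memory word with an attached full/empty bit (hypothetical model)."""
    value: Any = None
    full: bool = False


def read_and_empty(word: Word) -> Any:
    """Consumer side: succeed only if the word is full, then mark it empty."""
    if not word.full:
        raise SyncFault("read of an empty word")
    word.full = False
    return word.value


def write_and_fill(word: Word, value: Any) -> None:
    """Producer side: succeed only if the word is empty, then mark it full."""
    if word.full:
        raise SyncFault("write to a full word")
    word.value = value
    word.full = True


def sync_read(word: Word, handler: Callable[[Word], None],
              max_attempts: int = 3) -> Any:
    """Try the fast path; on failure, run the software handler and retry.

    The handler decides what to do about the failure (retry, block,
    reschedule). Here it is just a callback; the bound on attempts is an
    assumption made so the sketch always terminates.
    """
    for _ in range(max_attempts):
        try:
            return read_and_empty(word)   # common case: no handler involved
        except SyncFault:
            handler(word)                 # uncommon case: software takes over
    raise SyncFault("gave up after repeated failed attempts")
```

For example, a handler that simulates "the producer got scheduled" by calling `write_and_fill(word, 42)` lets a `sync_read` on an initially empty word fail once, run the handler, and then return 42 on the retry, leaving the word empty again.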
Similar resources
Low Cost Support for Fine-Grain Sychronization in. . .
As multiprocessors scale beyond the limits of a few tens of processors, they must look beyond traditional methods of synchronization to minimize serialization and achieve the high degrees of parallelism required to utilize large machines. By allowing synchronization at the level of the smallest unit of memory, fine-grain synchronization achieves these goals. Unfortunately, supporting efficient fine-g...
Comparative Evaluation of Fine- and Coarse-Grain Approaches for Software Distributed Shared Memory
Symmetric multiprocessors (SMPs) connected with low-latency networks provide attractive building blocks for software distributed shared memory systems. Two distinct approaches have been used: the fine-grain approach that instruments application loads and stores to support a small coherence granularity, and the coarse-grain approach based on virtual memory hardware that provides coherence at a p...
Fine-grain Parallelism with Minimal Hardware Support: A Compiler-Controlled Threaded
In this paper, we present a relatively primitive execution model for fine-grain parallelism, in which all synchronization, scheduling, and storage management is explicit and under compiler control. This is defined by a threaded abstract machine (TAM) with a multilevel scheduling hierarchy. Considerable temporal locality of logically related threads is demonstrated, providing an avenue for effectiv...
W.M. Zuberek: Performance Analysis of Fine-grain Multithreaded Multiprocessors
Instruction-level multithreading is an architectural approach to tolerating long-latency memory accesses and synchronization delays in distributed-memory systems. The paper presents a timed Petri net model of a fine-grain multithreaded distributed-memory multiprocessor system at the instruction execution level, and illustrates performance analysis by results obtained from simulation of the deri...
Time-Shifted Modules: Exploiting Code Modularity for Fine Grain Parallelization
Multi-threaded processors and chip-multiprocessors execute concurrent threads in close physical proximity, potentially reducing the cost of synchronization and communication significantly and enabling the parallelization of programs at a fine grain. In this paper, we explore a source of fine-grain parallelism present in programs due to their modular nature. Concurrency is derived from executing...